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Abstract 

We introduce the well-tempered ensemble (WTE) which is the biased ensemble sampled by well- 
tempered metadynamics when the energy is used as collective variable. WTE can be designed 
so as to have approximately the same average energy as the canonical ensemble but much larger 
fluctuations. These two properties lead to an extremely fast exploration of phase space. An even 
greater efficiency is obtained when WTE is combined with parallel tempering. Unbiased Boltzmann 
averages are computed on the fly by a recently developed reweighting method [M. Bonomi et al. 
J. Comput. Chem. 30, 1615 (2009)]. We apply WTE and its parallel tempering variant to the 2d 
Ising model and to a Go-model of BIV protease, demonstrating in these two representative cases 
that convergence is accelerated by orders of magnitude. 
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Monte Carlo (MC) or molecular dynamics (MD) simulations are routinely applied in 
all areas of science. However, severe difficulties are encountered when multiple metastable 
states separated by large free-energy barriers are present. Nucleation from one phase to 
another, chemical reactions, and protein folding are important examples. Accessing the 
low probability regions separating one state from another can overcome this difficulty. In 
standard MC or MD this is not possible and the system remains confined to its initial basin 
hindering a proper phase space exploration. Sampling low probability regions would also 
be of great help in free-energy differences calculation Hence many enhanced sampling 



methods have been suggested [2H12]. 



Recently, we have developed metadynamics [1311 where few difficult to sample degrees of 
freedom or collective variables (CV) are selected [14|,L15|. If the CV are well chosen large free- 
energy barriers can be overcome and the associated free-energy surface (FES) reconstructed 



161 ] . Well-tempered metadynamics 



12| is a non-trivial evolution of the method which lends 



itself to reweighting thus allowing the calculation of unbiased canonical averages [18[. We 
show here that when the potential energy is used as CV a well definite distribution dubbed 
well-tempered ensemble (WTE) is sampled. Using WTE is possible to observe transitions 
between states that otherwise would have been impossible to study in standard MC or MD. 
Many approaches have been already suggested in which the energy distribution is altered 



artificially 
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-l24j. However, all these methods can evaluate only the density of states 



from which thermal properties can be determined. If information on other variables is 



needed for each new variable a separate calculation is required [20|, [25(. Here instead full 
information on all the variables distribution can be obtained from a single run. Furthermore 
in an appropriate combination with parallel tempering (PT) 26], we show that orders of 
magnitude sampling efficiency can be gained. 

Let us use as CV the potential energy U = U(R) where R is the full set of atomic 
coordinates. In well-tempered metadynamics the Newton's equations are altered by the 
addition of a bias potential V(U(R),t): 

dU(R) dV(U(R),t) 



mR 



whose time evolution is governed by: 



dR 



dR 



(1) 



. V(U,t) 

V(U,t)=ue ""^fy 



(2) 



where m are the atomic masses, while u and AT are parameters which have the dimension 
of an energy rate and a temperature respectively. Asymptotically, V(U,t): 

V(U,t^oo) = -(l- 1 - l )F(U), (3) 



with 7 = (T + AT)/T > 1 and F(U) = -\ In S -^f^m^-^ ■ Within an irrelevant 
constant, 

F(U) = U-P~ l \nN{U) (4) 



where N(U) =f dRS(U — U (R)) is the number of states of energy U, which is a T indepen- 
dent property p, [l9|, |20| . V(U, t) quickly converges to its t — Y oo limit and the configurations 
are distributed according to: 

Z, = J dRe-WW, (5) 

with 

U 7 (R) = U{R) - (1 -7" 1 ) [U{R) -r^nNiUiR))] , (6) 
which defines WTE. It is then easy to rewrite the partition function Z 7 as: 

Zy= f dUP(U)^, (7) 



where P(U) = e ^ u N{U) is proportional to the energy probability distribution in the canon- 
ical ensemble. Varying 7 one goes from the canonical partition function (7 = 1) to the 



multicanonical one (7 = 00) 27J]. In order to gain insight into the properties we make 



the assumption that P(U) is strictly Gaussian, P(U) oc e ^au 2 , where (U) is the aver- 
age energy in the canonical ensemble and AU 2 the corresponding fluctuation [28||. Thus 

1 (U~{U)f 

P{U)~< oc e 2 t ac/2 implying the same average energy as in the canonical case \U) 1 = (U) 
but 7 time larger fluctuations. The Gaussian assumption is not always justified, for instance 
when (U) is close to the ground state energy or to a critical point. Still for reasonably large 
7 one finds (JJ) 1 close to (U) with fluctuations which grow approximately linearly with 7. 
In a rather loose sense it is as if a quasi- critical behavior is induced at all temperatures. 
This similitude is further increased by the fact that dynamical correlations are slowed down. 
However, when 7 increases even further the non-Gaussian tails in P(U) are amplified until 
for 7—7-00 one reaches the multicanonical limit. 

We now combine WTE with PT (PT-WTE). In PT, n replicas of the system at the 
temperatures i = l,n are introduced and a MC procedure is used to attempt exchanging 



configurations between replicas. Colder replicas are prevented from being trapped in local 
minima by the exchange with the higher temperature ones. A figure of merit is the ability 
of a replica to diffuse across all range of $ and methods that speed up this diffusion have 
been suggested (see Ref. [29 1 and references within). Given the special properties of WTE, 
it is tempting to explore its performance when combined with PT since one expects that the 
enhanced energy fluctuations will greatly facilitate exchange processes. In addition, if one 
use the same 7 factor for all the the swapping probability in PT-WTE is determined by: 

A M = 7- 1 (A - mU(Ri) - Ufa)), (8) 

implying a factor 7 reduction relative to conventional PT (7 = 1). This is possibly the main 
result of this paper and shows why PT-WTE leads to fast diffusion across the 

We now present two representative applications of WTE and of PT-WTE to substantiate 
our claim. First we consider the performance of WTE in the single replica mode. We 
simulate the two dimensional ferromagnetic Ising model for which an exact solution exists 



30[ and on which a large number of methods have been tested 3l|, l32|] . The Hamiltonian for 
this model is: Ti = —JJ2<ij> SiSj. We put J = 1 and Si = ±1 are spins on a square lattice 
with side L. Periodic boundary conditions are applied and only first-neighbor interactions 
are included. In the ferromagnetic state standard MC explores only one magnetization 
direction (Fig[T]). WTE instead is able to sample either spin orientations overcoming the 
large free-energy barrier (~ llO/c^T) that separates these two equivalent states. It is also 
seen that while the average values of the magnetization is approximately correct (|M| ~ 1 in 
the ferromagnetic phase and M ~ in the paramagnetic one), the energy fluctuations grow 
with 7 (see Tabled]). For T>T C the Gaussian assumption is clearly justified since (t/) 7 and 
At/2/7 are approximately constant up to 7 ~ 100. For T<T C and up to 7 ~ 100, (t/) 7 is also 
little shifted. However, the non linear fluctuation growth signals deviations from Gaussian 
behavior due to the proximity to the energy lower bound. In both cases relaxation times 
grow linearly with 7 and do not outweigh the benefit of increased fluctuations. We expect a 
useful 7 to be of the order of 7 ~ /c^TAF/At/ 2 , where AF is the relevant barrier. As such, 
7 will be system and size dependent. 



Despite the fact that we have not attempted to optimize the replica distribution |31| . 
the use of WTE leads to a great improvement in efficiency when combined with PT. This 
is measured in terms of round-trip time t 7 , which is the time needed for a configuration in 



FIG. 1. WTE (black) compared to standard ensemble sampling (red) at two temperatures, below 
(Ti = 1.0) and above (T2 = 5.0) the critical temperature T c = 2.269. The unit on the x-axis is 10 3 
MC steps. Each MC move consists of a complete sweep of the L=20 site lattice. Gaussians of 0.1 
height and 5.0 width were deposited at each step. 

n 

the coldest replica to reach the hottest temperature and come back [31]. It can be seen in 
Fig. [2] that the speed-up grows almost linearly with 7 up to 7 ~ 30 for L = 10 and 7 ~ 100 



for L = 20, and is much larger than what reported by optimizing the distribution [3 If . 
Empirically, the ratio between the smallest energy difference between successive and the 
largest energy fluctuation measured in the unbiased ensemble provides a good estimate for 
the optimal 7. Above this value the speed-up ceases to be linear in 7 and the increased 
fluctuations and the reduction in acceptance ratio do not compensate the dynamical slowing 
down. 

As a further example of the power of PT-WTE, we show an application to the folding 



process of the monomer of HIV-1 protease. For this we use a Go-model [33] which has a 
transition at Tf ~ 80K. For this reason, simulations using straightforward PT give poor 



results unless the distribution of temperatures across Tf is optimized [3l|. In this example, 
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TABLE I. Average value, fluctuation and correlation time of the energy in WTE as a function of 
7 at the two representative temperatures, below and above T c . The value of r/7 at 7 = 1, T = T\ 
is smaller than a single sweep. 



we do not use the potential energy as CV, but the variable on which the energy uniquely 
depends, namely the total number of native contacts between C a atoms. It is easy to show 
that in this case an expression equivalent to Eq. [S] holds. Simulations have been carried out 



using GROMACS |34j and PLUMED |35|. In this case £i/t 7 ~ 66. We also measure the 
speed-up in terms of MD steps needed to converge the free-energy difference between folded 
and unfolded state. In Fig. H]we see that PT-WTE converges in less than 2.5 • 10 7 steps, 
while standard PT is still not converged after 2.4 • 10 8 steps. We also show that allowing 
for replicas to exchange is crucial since WTE alone fails to converge in the simulation time. 
As a further check we reconstruct the thermodynamics of three relevant sub-units of HIV-1 
protease (Fig. [5]). Comparing our results with an umbrella sampling calculation that uses a 
posteriori the PT-WTE bias, we find an excellent agreement. 

In conclusion, we have shown that WTE can be profitably used as a biased ensemble to 
greatly enhance sampling speed especially when associated to parallel tempering. Properly 
designed WTE combines two properties that are useful in this respect. The fact that av- 
erage values are not changed ensures a significant overlap between the biased and unbiased 
ensemble facilitating the reconstruction of the latter. Yet the enhanced fluctuations favor 
exploring low probability regions and overcoming large barriers. Measurin g th e efficiency 
of this new method is a subtle question. We can claim on the basis of Ref. JiJ] that when 
it comes to reconstructing N(U) we can obtain an efficiency at least comparable to Wang- 
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FIG. 2. Speed-up of PT-WTE compared to standard PT as a function of 7 in the Ising model with 



L=10 (left panel) and L=20 (right pane 
in the interval 0.1<T<10.0 as in Ref. 
sweep. Gaussian parameters as in Fig. [TJ 



}. 21 replicas were distributed in a geometric progression 



3l| . Exchange moves were attempted after every lattice 



Landau. Furthermore, we have the additional bonus that we do not need extra calculations 
or expensive reconstruction of multidimensional histograms to evaluate quantities different 
from the energy or its fluctuations. In this respect the fair comparison is with PT where 
we gain relative to Ref. 3l| as much as a factor of ~ 100 on the Ising model with L = 20. 
Much remains to be done to understand WTE properties and to optimize its performances. 
However, the very encouraging results obtained at these early stages suggest that a powerful 
method has been added to the literature and that exciting applications can be expected. 
Extension of the method in which additional CV are added to U is straightforward and will 
be explored in the near future. 

We would like to thank Michele Ceriotti and Alessandro Barducci for fruitful discussions. 
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FIG. 3. Left panel. FES as a function of the magnetization F(M) of the L=10 next neighbor 
ferromagnetic Ising model below and above the critical temperature, compared with an extensive 
PT calculation. The statistical error for the PT-WTE calculations is smaller than 2 %. We have 
computed a similar curve for L=20 but we do not show it here because the PT calculation to 
compare with could not be converged. It is remarkable that both magnetization could be explored 
in spite of a barrier of the order of llO/c^T. Right panels. Specific heat per spin (top), modulus 
of the magnetization (middle) and magnetic susceptibility (bottom) as a function of temperature 



301 ] . In the middle and 



(L=20). The continuos line in the top panel is the finite size exact solution 
bottom panel the line is just a guide to the eye. The statistical error found is at worst 1 % in all 
cases. 



Calculations have been carried out on the BRUTUS cluster at ETH Zurich. 
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